{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# OPENAI 新的 embedding models\n",
    "  一种是小型且高效的 **text-embedding-3-small** 模型，另一种是大型且更强大的**text-embedding-3-large**模型。\n",
    "\n",
    "  小型模型虽然运行更快、需要的计算资源更少，但可能在捕捉复杂语义信息方面不如大型模型强大。大型模型虽然需要更多的计算资源和时间，但能更准确地理解文本的深层意义。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "label_sentences = [\n",
    "    # 数学\n",
    "    (\"请解释什么是费马大定理。\", \"数学\"),\n",
    "    (\"二次方程有哪些解法？\", \"数学\"),\n",
    "    (\"如何理解欧拉公式？\", \"数学\"),\n",
    "    (\"请解释一下毕达哥拉斯定理。\", \"数学\"),\n",
    "    (\"如何计算圆的面积？\", \"数学\"),\n",
    "    \n",
    "    ##生物\n",
    "    (\"DNA和RNA有什么区别？\", \"生物\"),\n",
    "    (\"什么是光合作用？\", \"生物\"),\n",
    "    (\"细胞分裂有哪些类型？\", \"生物\"),\n",
    "    (\"遗传学中的孟德尔定律是什么？\", \"生物\"),\n",
    "    (\"什么是生态系统？\", \"生物\"),\n",
    "    \n",
    "    ## 体育\n",
    "    (\"网球比赛的计分系统是如何工作的？\", \"体育\"),\n",
    "    (\"足球的基本规则是什么？\", \"体育\"),\n",
    "    (\"如何在篮球比赛中扣篮？\", \"体育\"),\n",
    "    (\"自由泳的理想技巧是什么？\", \"体育\"),\n",
    "    (\"一个人如何提高跑步速度？\", \"体育\"),\n",
    "    \n",
    "    ## 闲聊\n",
    "    (\"最近天气怎么样？\", \"闲聊\"),\n",
    "    (\"你最喜欢的书籍是什么？\", \"闲聊\"),\n",
    "    (\"你最喜欢的歌手是谁？\", \"闲聊\"),\n",
    "    (\"这周你有什么旅行计划？\", \"闲聊\"),\n",
    "    (\"你昨天吃的什么菜？\", \"闲聊\"),  ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "X, y = zip(*label_sentences)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_data = [\n",
    "    # 数学\n",
    "    \n",
    "    (\"如何证明费马小定理？\", \"数学\"),\n",
    "    (\"三次方程解法有哪些？\", \"数学\"),\n",
    "    (\"复数的概念是什么？\", \"数学\"),\n",
    "    (\"余弦定理是如何推导的？\", \"数学\"),\n",
    "    (\"矩形的面积计算方法是什么？\", \"数学\"),\n",
    "    (\"什么是无理数？\", \"数学\"),\n",
    "    (\"如何使用向量解决几何问题？\", \"数学\"),\n",
    "    (\"椭圆的标准方程是什么？\", \"数学\"),\n",
    "    (\"什么是随机变量的期望值？\", \"数学\"),\n",
    "    (\"如何求解线性不等式组？\", \"数学\"),\n",
    "    (\"实数和虚数之间有什么区别？\", \"数学\"),\n",
    "    (\"线性代数中的矩阵乘法如何运算？\", \"数学\"),\n",
    "    (\"直角三角形中的正弦、余弦、正切是如何定义的？\", \"数学\"),\n",
    "    # 生物\n",
    "    (\"蛋白质的功能有哪些？\", \"生物\"),\n",
    "    (\"细胞的能量是如何产生的？\", \"生物\"),\n",
    "    (\"动物细胞和植物细胞的区别是什么？\", \"生物\"),\n",
    "    (\"基因突变对生物有什么影响？\", \"生物\"),\n",
    "    (\"什么是生物进化论？\", \"生物\"),\n",
    "    (\"病毒与细菌有何不同？\", \"生物\"),\n",
    "    (\"克隆技术是如何工作的？\", \"生物\"),\n",
    "    (\"激素在人体中的作用是什么？\", \"生物\"),\n",
    "    (\"什么是生物群落？\", \"生物\"),\n",
    "    (\"人体免疫系统是如何工作的？\", \"生物\"),\n",
    "    (\"光合作用与呼吸作用的关系是什么？\", \"生物\"),\n",
    "    (\"DNA复制的过程是怎样的？\", \"生物\"),\n",
    "    (\"脊椎动物和无脊椎动物的主要区别是什么？\", \"生物\"),\n",
    "    (\"什么是生物技术？\", \"生物\"),\n",
    "    (\"植物如何进行水分和养分的吸收？\", \"生物\"),\n",
    "    # 体育\n",
    "    (\"乒乓球比赛的规则是什么？\", \"体育\"),\n",
    "    (\"马拉松比赛的历史由来是什么？\", \"体育\"),\n",
    "    (\"体操比赛中的评分标准是什么？\", \"体育\"),\n",
    "    (\"什么是健身运动的基本原则？\", \"体育\"),\n",
    "    (\"羽毛球比赛规则有哪些？\", \"体育\"),\n",
    "    (\"如何训练提高游泳速度？\", \"体育\"),\n",
    "    (\"跳远的技术要领是什么？\", \"体育\"),\n",
    "    (\"篮球运动中的战术布置有哪些？\", \"体育\"),\n",
    "    (\"足球比赛中的禁区规则是什么？\", \"体育\"),\n",
    "    (\"铁人三项比赛包括哪些内容？\", \"体育\"),\n",
    "    (\"现代奥运会的起源是什么？\", \"体育\"),\n",
    "    (\"高尔夫球的基本打法有哪些？\", \"体育\"),\n",
    "    (\"什么是体能训练的核心原则？\", \"体育\"),\n",
    "    (\"冰球比赛的基本规则是什么？\", \"体育\"),\n",
    "    (\"如何正确地进行力量训练？\", \"体育\"),\n",
    "    # 闲聊\n",
    "    (\"今年流行什么样的音乐？\", \"闲聊\"),\n",
    "    (\"推荐一些休闲阅读的书籍。\", \"闲聊\"),\n",
    "    (\"有什么好看的电视剧推荐\", \"闲聊\"),\n",
    "    (\"周末有什么好的出游建议？\", \"闲聊\"),\n",
    "    (\"最近有什么新的科技产品吗？\", \"闲聊\"),\n",
    "    (\"你平时喜欢什么样的运动？\", \"闲聊\"),\n",
    "    (\"有没有什么健康饮食的建议？\", \"闲聊\"),\n",
    "    (\"如何有效管理时间？\", \"闲聊\"),\n",
    "    (\"有什么关于旅行的节省费用的技巧？\", \"闲聊\"),\n",
    "    (\"如何选择一本好书？\", \"闲聊\"),\n",
    "    (\"学习外语的最佳方法是什么？\", \"闲聊\"),\n",
    "    (\"怎样培养良好的阅读习惯？\", \"闲聊\"),\n",
    "    (\"有什么推荐的户外活动？\", \"闲聊\"),\n",
    "    (\"最近有什么值得关注的科技发展？\", \"闲聊\"),\n",
    "    (\"如何有效减压放松？\", \"闲聊\"),\n",
    "    (\"有没有好的自我提升的书籍推荐？\", \"闲聊\")\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 使用m3e"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "c:\\Users\\blackink\\.conda\\envs\\langchain\\lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "moka-ai/m3e-base 的维度是768\n"
     ]
    }
   ],
   "source": [
    "from sentence_transformers import SentenceTransformer\n",
    "from tqdm import tqdm\n",
    "\n",
    "\n",
    "model_name = 'moka-ai/m3e-base'\n",
    "model = SentenceTransformer(model_name)\n",
    "\n",
    "#编码label-sentence\n",
    "embeddings_m3e = model.encode(X)\n",
    "\n",
    "print(f'{model_name} 的维度是{len(embeddings_m3e[0])}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cos_similarity import similarity_matrix , top_scores"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_query, label_class = zip(*test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 59/59 [00:03<00:00, 18.42it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "M3E Accuracy: 67.80%\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "correct = 0\n",
    "for x_q, target_class in tqdm(zip(X_query, label_class), total=len(X_query)):\n",
    "    #编码query语句\n",
    "    embeddings_query = model.encode(x_q)\n",
    "    #计算相似度\n",
    "    sim_score = similarity_matrix(embeddings_query , embeddings_m3e)\n",
    "    #获取topk 索引、得分\n",
    "    topk_score , topk_idx = top_scores(sim_score)\n",
    "    is_present = all(y[index] == target_class for index in topk_idx)\n",
    "    if is_present: \n",
    "        correct += 1\n",
    "        \n",
    "accuracy = correct / len(X_query)         \n",
    "print(f\"M3E Accuracy: {accuracy*100:.2f}%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 使用OPENAI embedding model  计算相似度准确率"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "import os\n",
    "import numpy  as np\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
    "\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_accuracy(model_name , X_query, label_class):\n",
    "    embeddings_model = OpenAIEmbeddings(model=model_name ,dimensions= 256)\n",
    "    # 编码标记库sentence\n",
    "    label_result = embeddings_model.embed_documents(X)\n",
    "    label_result = np.array(label_result)\n",
    "    print(f'{model_name} 的维度是{label_result.shape[1]}')\n",
    "    \n",
    "    correct = 0\n",
    "    for x_q, target_class in tqdm(zip(X_query, label_class), total=len(X_query)):\n",
    "        #编码query语句\n",
    "        embeddings_query = embeddings_model.embed_query(x_q)\n",
    "        embeddings_query = np.array(embeddings_query)\n",
    "        #计算相似度\n",
    "        sim_score = similarity_matrix(embeddings_query , label_result)\n",
    "        #获取topk 索引、得分\n",
    "        topk_score , topk_idx = top_scores(sim_score)\n",
    "        is_present = all(y[index] == target_class for index in topk_idx)\n",
    "        if is_present: \n",
    "            correct += 1\n",
    "            \n",
    "    accuracy = correct / len(X_query)         \n",
    "    print(f\"{model_name}: {accuracy*100:.2f}%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### text-embedding-ada-002"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-ada-002 的维度是1536\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 59/59 [01:14<00:00,  1.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-ada-002: 54.24%\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "get_accuracy(\"text-embedding-ada-002\" , X_query, label_class)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### text-embedding-3-large"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Warning: model not found. Using cl100k_base encoding.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-3-large 的维度是256\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "  0%|          | 0/59 [00:00<?, ?it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  2%|▏         | 1/59 [00:00<00:57,  1.00it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  3%|▎         | 2/59 [00:01<00:40,  1.42it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  5%|▌         | 3/59 [00:05<01:57,  2.10s/it]Warning: model not found. Using cl100k_base encoding.\n",
      "  7%|▋         | 4/59 [00:06<01:29,  1.63s/it]Warning: model not found. Using cl100k_base encoding.\n",
      "  8%|▊         | 5/59 [00:06<01:05,  1.21s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 10%|█         | 6/59 [00:08<01:09,  1.30s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 12%|█▏        | 7/59 [00:08<00:53,  1.03s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 14%|█▎        | 8/59 [00:10<01:00,  1.19s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 15%|█▌        | 9/59 [00:10<00:47,  1.04it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 17%|█▋        | 10/59 [00:11<00:39,  1.23it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 19%|█▊        | 11/59 [00:14<01:18,  1.64s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 20%|██        | 12/59 [00:15<01:00,  1.30s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 22%|██▏       | 13/59 [00:15<00:49,  1.08s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 24%|██▎       | 14/59 [00:17<00:56,  1.26s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 25%|██▌       | 15/59 [00:18<00:59,  1.34s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 27%|██▋       | 16/59 [00:19<00:47,  1.11s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 29%|██▉       | 17/59 [00:20<00:44,  1.05s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 31%|███       | 18/59 [00:20<00:35,  1.15it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 32%|███▏      | 19/59 [00:21<00:30,  1.33it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 34%|███▍      | 20/59 [00:21<00:27,  1.44it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 36%|███▌      | 21/59 [00:22<00:23,  1.59it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 37%|███▋      | 22/59 [00:22<00:21,  1.71it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 39%|███▉      | 23/59 [00:23<00:19,  1.82it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 41%|████      | 24/59 [00:24<00:22,  1.53it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 42%|████▏     | 25/59 [00:24<00:21,  1.59it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 44%|████▍     | 26/59 [00:25<00:19,  1.71it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 46%|████▌     | 27/59 [00:25<00:18,  1.74it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 47%|████▋     | 28/59 [00:26<00:20,  1.48it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 49%|████▉     | 29/59 [00:27<00:19,  1.54it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 51%|█████     | 30/59 [00:27<00:18,  1.59it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 53%|█████▎    | 31/59 [00:28<00:20,  1.35it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 54%|█████▍    | 32/59 [00:29<00:17,  1.51it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 56%|█████▌    | 33/59 [00:29<00:15,  1.66it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 58%|█████▊    | 34/59 [00:30<00:14,  1.77it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 59%|█████▉    | 35/59 [00:30<00:13,  1.77it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 61%|██████    | 36/59 [00:31<00:12,  1.84it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 63%|██████▎   | 37/59 [00:31<00:11,  1.91it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 64%|██████▍   | 38/59 [00:33<00:16,  1.24it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 66%|██████▌   | 39/59 [00:33<00:14,  1.42it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 68%|██████▊   | 40/59 [00:34<00:12,  1.57it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 69%|██████▉   | 41/59 [00:35<00:13,  1.36it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 71%|███████   | 42/59 [00:35<00:11,  1.45it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 73%|███████▎  | 43/59 [00:37<00:14,  1.11it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 75%|███████▍  | 44/59 [00:37<00:12,  1.24it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 76%|███████▋  | 45/59 [00:38<00:11,  1.20it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 78%|███████▊  | 46/59 [00:40<00:13,  1.02s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 80%|███████▉  | 47/59 [00:40<00:10,  1.13it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 81%|████████▏ | 48/59 [00:41<00:08,  1.31it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 83%|████████▎ | 49/59 [00:42<00:09,  1.05it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 85%|████████▍ | 50/59 [00:43<00:07,  1.24it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 86%|████████▋ | 51/59 [00:44<00:08,  1.12s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 88%|████████▊ | 52/59 [00:45<00:06,  1.05it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 90%|████████▉ | 53/59 [00:46<00:05,  1.20it/s]Warning: model not found. Using cl100k_base encoding.\n",
      " 92%|█████████▏| 54/59 [00:48<00:06,  1.32s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 93%|█████████▎| 55/59 [01:28<00:52, 13.08s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 95%|█████████▍| 56/59 [01:29<00:28,  9.45s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 97%|█████████▋| 57/59 [01:33<00:15,  7.77s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 98%|█████████▊| 58/59 [01:34<00:05,  5.59s/it]Warning: model not found. Using cl100k_base encoding.\n",
      "100%|██████████| 59/59 [01:34<00:00,  1.61s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-3-large: 59.32%\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "get_accuracy(\"text-embedding-3-large\" , X_query, label_class)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### text-embedding-3-small"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Warning: model not found. Using cl100k_base encoding.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-3-small 的维度是1536\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "  0%|          | 0/59 [00:00<?, ?it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  2%|▏         | 1/59 [00:00<00:53,  1.08it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  3%|▎         | 2/59 [00:01<00:54,  1.05it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  5%|▌         | 3/59 [00:02<00:43,  1.27it/s]Warning: model not found. Using cl100k_base encoding.\n",
      "  7%|▋         | 4/59 [00:05<01:24,  1.54s/it]Warning: model not found. Using cl100k_base encoding.\n",
      "  8%|▊         | 5/59 [00:05<01:04,  1.19s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 10%|█         | 6/59 [00:07<01:18,  1.48s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 12%|█▏        | 7/59 [00:09<01:17,  1.50s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 14%|█▎        | 8/59 [00:11<01:28,  1.73s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 15%|█▌        | 9/59 [00:12<01:21,  1.64s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 17%|█▋        | 10/59 [00:14<01:19,  1.62s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 19%|█▊        | 11/59 [00:15<01:13,  1.54s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 20%|██        | 12/59 [00:17<01:11,  1.51s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 22%|██▏       | 13/59 [00:18<01:07,  1.48s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 24%|██▎       | 14/59 [00:20<01:09,  1.54s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 25%|██▌       | 15/59 [00:22<01:16,  1.75s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 27%|██▋       | 16/59 [00:25<01:23,  1.94s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 29%|██▉       | 17/59 [00:27<01:25,  2.04s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 31%|███       | 18/59 [00:30<01:34,  2.30s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 32%|███▏      | 19/59 [00:31<01:18,  1.96s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 34%|███▍      | 20/59 [00:32<01:05,  1.67s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 36%|███▌      | 21/59 [00:33<00:57,  1.52s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 37%|███▋      | 22/59 [00:34<00:50,  1.37s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 39%|███▉      | 23/59 [00:36<00:58,  1.61s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 41%|████      | 24/59 [00:39<01:09,  1.98s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 42%|████▏     | 25/59 [00:40<00:59,  1.76s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 44%|████▍     | 26/59 [00:41<00:51,  1.55s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 46%|████▌     | 27/59 [00:44<01:01,  1.93s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 47%|████▋     | 28/59 [00:46<01:02,  2.01s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 49%|████▉     | 29/59 [00:48<00:56,  1.87s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 51%|█████     | 30/59 [00:49<00:47,  1.65s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 53%|█████▎    | 31/59 [00:51<00:46,  1.66s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 54%|█████▍    | 32/59 [00:52<00:39,  1.45s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 56%|█████▌    | 33/59 [00:53<00:33,  1.31s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 58%|█████▊    | 34/59 [00:55<00:38,  1.56s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 59%|█████▉    | 35/59 [00:56<00:34,  1.44s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 61%|██████    | 36/59 [01:01<00:59,  2.60s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 63%|██████▎   | 37/59 [01:04<01:00,  2.76s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 64%|██████▍   | 38/59 [01:05<00:46,  2.22s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 66%|██████▌   | 39/59 [01:07<00:39,  1.97s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 68%|██████▊   | 40/59 [01:09<00:39,  2.06s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 69%|██████▉   | 41/59 [01:11<00:38,  2.14s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 71%|███████   | 42/59 [01:13<00:31,  1.85s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 73%|███████▎  | 43/59 [01:14<00:27,  1.71s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 75%|███████▍  | 44/59 [01:15<00:24,  1.61s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 76%|███████▋  | 45/59 [01:16<00:20,  1.43s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 78%|███████▊  | 46/59 [01:17<00:16,  1.29s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 80%|███████▉  | 47/59 [01:19<00:18,  1.53s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 81%|████████▏ | 48/59 [01:22<00:21,  1.93s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 83%|████████▎ | 49/59 [01:24<00:17,  1.79s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 85%|████████▍ | 50/59 [01:26<00:16,  1.79s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 86%|████████▋ | 51/59 [01:28<00:16,  2.11s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 88%|████████▊ | 52/59 [01:31<00:16,  2.33s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 90%|████████▉ | 53/59 [01:33<00:12,  2.15s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 92%|█████████▏| 54/59 [01:36<00:11,  2.28s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 93%|█████████▎| 55/59 [01:37<00:08,  2.17s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 95%|█████████▍| 56/59 [01:39<00:06,  2.01s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 97%|█████████▋| 57/59 [01:40<00:03,  1.75s/it]Warning: model not found. Using cl100k_base encoding.\n",
      " 98%|█████████▊| 58/59 [01:41<00:01,  1.58s/it]Warning: model not found. Using cl100k_base encoding.\n",
      "100%|██████████| 59/59 [01:43<00:00,  1.75s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text-embedding-3-small: 50.85%\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "get_accuracy(\"text-embedding-3-small\" , X_query, label_class)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
